Prevalence of child undernutrition measures and their spatio-demographic inequalities in Bangladesh: an application of multilevel Bayesian modelling

Micro-level statistics on child undernutrition are highly prioritized by stakeholders for measuring and monitoring progress on the sustainable development goals. In this regard, district-representative data were collected in the Bangladesh Multiple Indicator Cluster Survey 2019 for identifying localised disparities. However, district-level estimates of the undernutrition indicators (stunting, wasting and underweight) remain largely unexplored. This study aims to estimate the district-level prevalence of these indicators and to explore their disparities at sub-national (division) and district-level spatio-demographic domains cross-classified by children's sex, age-group, and place of residence. Bayesian multilevel models are developed at the sex-age-residence-district level, accounting for cross-sectional, spatial and spatio-demographic variations. The detailed domain-level predictions are aggregated to higher aggregation levels, which results in numerically consistent and reasonable estimates when compared to the design-based direct estimates. The spatio-demographic distributions of the undernutrition indicators indicate that south-western districts have lower vulnerability to undernutrition than north-eastern districts, and show significant inequalities within and between administrative hierarchies, attributable to child age and place of residence. These disparities in undernutrition at both aggregated and disaggregated spatio-demographic domains can aid policymakers in the social inclusion of the most vulnerable to meet the sustainable development goals by 2030. Supplementary Information: The online version contains supplementary material available at 10.1186/s12889-022-13170-4.

The recent Multiple Indicator Cluster Survey (MICS), conducted in 2019, covered all 64 districts in the survey design in order to identify localised disparities as well as the most vulnerable groups for social inclusion [1]. However, district-level estimates of the three undernutrition indicators have remained largely unexplored and unpublished. The availability of this large dataset makes it possible to investigate disparities in the prevalence of child undernutrition at the district level as well as at the corresponding detailed spatio-demographic domains. This study therefore aims to estimate the prevalence of stunting, wasting and underweight for 1280 small domains, cross-classified by children's sex (female and male), five age-groups (0-11, 12-23, 24-35, 36-47, and 48-59 months), place of residence (rural and urban) and 64 districts, by applying an extension of the small area estimation method based on multilevel modelling. This approach explores spatio-demographic variation at both division and district levels (the first and second administrative units, respectively). The disaggregated prevalence estimates of the child undernutrition indicators, together with the observed spatio-demographic inequalities, might help policymakers to understand the spatial disparity in undernutrition vulnerability and how unevenly this vulnerability is distributed throughout Bangladesh. To the best of our knowledge, no such investigation of the dynamics of spatio-demographic disparities in child undernutrition has yet been carried out in Bangladesh. We therefore expect this understanding to be of substantial benefit for policy-making aimed at meeting the sustainable development goals (SDGs) by 2030.
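
The cross-classification and the aggregation of domain-level predictions can be illustrated with a minimal sketch. It assumes that aggregation uses population-weighted means of the domain prevalences; the weighting scheme, the function names and the toy numbers are illustrative assumptions, not taken from the study.

```python
from itertools import product

SEXES = ("female", "male")
AGE_GROUPS = ("0-11", "12-23", "24-35", "36-47", "48-59")  # months
RESIDENCES = ("rural", "urban")
N_DISTRICTS = 64


def enumerate_domains(n_districts=N_DISTRICTS):
    """List all (district, residence, sex, age_group) domains."""
    districts = range(1, n_districts + 1)
    return list(product(districts, RESIDENCES, SEXES, AGE_GROUPS))


def aggregate(prevalence, population, key):
    """Population-weighted mean prevalence per group.

    prevalence, population: dicts keyed by domain tuple.
    key: function mapping a domain tuple to its aggregation group.
    Assumption: weights are domain population sizes (hypothetical choice).
    """
    num, den = {}, {}
    for domain, p in prevalence.items():
        n = population[domain]
        group = key(domain)
        num[group] = num.get(group, 0.0) + n * p
        den[group] = den.get(group, 0.0) + n
    return {g: num[g] / den[g] for g in num if den[g] > 0}


domains = enumerate_domains()
print(len(domains))  # 2 sexes x 5 age-groups x 2 residences x 64 districts

# Toy example: two domains of district 1 aggregated to the district level.
toy_prev = {(1, "rural", "female", "0-11"): 0.2,
            (1, "urban", "female", "0-11"): 0.4}
toy_pop = {(1, "rural", "female", "0-11"): 100,
           (1, "urban", "female", "0-11"): 300}
print(aggregate(toy_prev, toy_pop, key=lambda d: d[0]))
```

Changing the `key` function (for example to `lambda d: (d[0], d[3])`) gives district-by-age aggregates from the same domain-level inputs.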

S.1 Data inputs
In the Bangladesh Multiple Indicator Cluster Survey (MICS) 2019, a total of 24,686 children under 5 years of age were recorded for measurement of weight and height in order to calculate their anthropometric indices for height-for-age, weight-for-age and weight-for-height, expressed in standard deviation units (z-scores) from the median of the reference population. According to the MICS 2019 report, about 2.8%, 4.5%, and 4.7% of children were excluded from the calculations of the weight-for-age, height-for-age, and weight-for-height indicators, respectively, because weight and height measurements were either missing or outside the normal range [1].
Though the MICS 2019 provides a large data set that is representative at the district level, the sample sizes of the targeted cross-classified domains are too small to estimate the undernutrition indicators with sufficient precision. The sample size for the rural sampled domains varies over about 11-60 with a mean of about 28, while for the urban domains it ranges between 1-66 with a mean of 6.5. Notably, there are about 14 domains with no observations for any of the three indicators. The comparison of domain-specific sample sizes with their respective population sizes obtained from Census 2011 (shown in Figure S.1 in the Supplementary file) also indicates that the sample sizes of most urban domains are very small compared to those of rural domains having an under-five population below 10,000. Consequently, the estimates of child undernutrition are expected to be considerably less stable for the urban domains.
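
As an illustration of how the three binary indicators follow from the z-scores, the sketch below applies the standard WHO cut-off of -2 standard deviations (stunting from height-for-age, underweight from weight-for-age, wasting from weight-for-height). The function names are hypothetical, and representing excluded or missing measurements as `None` is an assumption made for this example.

```python
def below_cutoff(z_score, cutoff=-2.0):
    """Return True/False for a valid z-score, None if it is missing."""
    if z_score is None:
        return None
    return z_score < cutoff


def classify_child(haz, waz, whz):
    """Map height-for-age, weight-for-age and weight-for-height z-scores
    to the three undernutrition indicators."""
    return {
        "stunting": below_cutoff(haz),      # low height-for-age
        "underweight": below_cutoff(waz),   # low weight-for-age
        "wasting": below_cutoff(whz),       # low weight-for-height
    }


# A child with a missing weight-for-height z-score is left out of the
# wasting calculation but still contributes to the other two indicators.
print(classify_child(haz=-2.5, waz=-1.0, whz=None))
```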
The distributions of the number of children in the census population and in the survey data set for the 1280 cross-classified domains (64 districts, rural-urban place of residence, children's sex and five age-groups: 0-11, 12-23, 24-35, 36-47, and 48-59 months) are plotted in Figure

S.2 Generalized variance function (GVF) model
The generalized variance function (GVF) models developed on the square-root scale of the standard errors are shown in Table S.1, Table S

S.3 Multilevel Bayesian model formation
Under the considered multilevel Bayesian model, the fixed-effects parameters are assigned a weakly informative prior distribution, p(β) = N(0, 100I). Scaled-inverse Wishart and half-Cauchy priors are used for the standard deviation parameters when unstructured [2] and diagonal [3] covariance structures are assumed, respectively.
A number of multilevel models structured as (1) in the main text were fitted by Markov chain Monte Carlo (MCMC) simulation for each of the undernutrition indicators in R [4] using the package mcmcsae [5]. Two model performance measures, the Widely Applicable Information Criterion or Watanabe-Akaike Information Criterion (WAIC) [6,7] and the Deviance Information Criterion (DIC) [8], are used to compare models developed with the same input estimates. The selected best multilevel model for each indicator was finally run with 2000 iterations, a simulation length sufficient to bring the Gelman-Rubin potential scale reduction factor below the recommended value of 1.10 for all model parameters and model predictions at convergence [9].
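
The convergence check can be illustrated with a minimal sketch of the classic Gelman-Rubin potential scale reduction factor for a single scalar parameter. This is a generic textbook calculation written in Python for illustration, not the routine used by mcmcsae; the number of chains and the simulated draws are hypothetical.

```python
import random
from statistics import mean, variance


def gelman_rubin(chains):
    """Potential scale reduction factor for m chains of equal length n."""
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least 2 chains of equal length >= 2")
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])   # W: mean within-chain variance
    between = n * variance(chain_means)            # B: between-chain variance
    if within == 0:
        raise ValueError("within-chain variance is zero")
    var_plus = (n - 1) / n * within + between / n  # pooled variance estimate
    return (var_plus / within) ** 0.5


random.seed(1)
# Three hypothetical well-mixed chains targeting the same distribution.
mixed = [[random.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(3)]
# Two chains stuck around different values: clearly not converged.
stuck = [[0.0, 1.0, 0.0, 1.0], [10.0, 11.0, 10.0, 11.0]]

print(gelman_rubin(mixed) < 1.10)
print(gelman_rubin(stuck) < 1.10)
```

In practice the factor is computed for every model parameter and prediction, and the simulation is lengthened until all values fall below the chosen threshold.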

S.4 Model assessment
The information criteria, namely the Widely Applicable Information Criterion or Watanabe-Akaike Information Criterion (WAIC) [6,7] and the Deviance Information Criterion (DIC) [8], were used to find the best model among the models developed with the same input estimates for each of the three undernutrition indicators. As model diagnostics, four discrepancy measures, (i) relative bias (RB), (ii) absolute relative bias (ARB), (iii) relative reduction of the standard errors (RRSE), and (iv) the ratio of coefficients of variation (CV), denoted by CVR, are calculated by comparing the model-based estimates θ̂_d with the corresponding direct estimates Ŷ_d for domain d to evaluate how the multilevel models perform. Hereafter, model-based and design-based estimates are denoted by SAE and DIR, respectively. The RB and ARB express the signed and the absolute difference between θ̂_d and Ŷ_d as a percentage of Ŷ_d. The RRSE, computed from the standard errors (SEs) of Ŷ_d and θ̂_d, indicates how much accuracy is gained by the SAE estimates compared to the DIR estimates, relative to the SAE estimates. At the most disaggregated level, the values of RRSE are expected to be higher than those at the higher aggregation levels due to a greater chance of small (and zero) samples. The fourth measure, CVR, examines the relative SEs of the SAE estimates by comparing their CVs to those of the DIR estimates. Values of CVR below 100% indicate that the SAE estimates provide better relative accuracy than the DIR estimates. All these measures are expressed as percentages in Table S.4. The measures are calculated at different (i.e., higher and lower) aggregation levels such as division, district, division×age, district×age, district×age×residence, and district×age×residence×sex.
For all the indicators, the values of RB, ARB and RRSE increase with the level of disaggregation, while the CV ratios decrease. This is expected since the direct estimates are more reliable and consistent at the higher aggregation levels than at the disaggregated levels. At the higher aggregation levels, the SAE estimates are as consistent as the DIR estimates (the measures show little difference from the direct estimates). More importantly, at the disaggregated levels the SAE estimates are more accurate than the DIR estimates. The SAE estimates appear to provide more gains for the cross-classified domains of district. The highest gains, in terms of RRSE and CVR, are observed at the most detailed level, at which the multilevel models are developed. Figure S.2 in the Supplementary file also supports that the developed multilevel models provide approximately unbiased and consistent estimates with reasonably better CVs at the most detailed level, particularly for the urban domains. This makes intuitive sense, since model-based estimation is expected to yield the largest gains at the detailed disaggregated level.
At higher aggregation levels, the values of RB and ARB are slightly higher for stunting and underweight than for wasting, which indicates that the SAE estimates are slightly higher than the DIR estimates (by only about 3% of the DIR estimates). The opposite pattern of ARB at the detailed level indicates that the SAE estimates are overestimated slightly more for wasting than for stunting and underweight. The likely reason is that the DIR estimates of wasting for many detailed level urban domains are too volatile, as shown in Figure S.2. Figure S.2 also shows how the developed multilevel models provide reasonable estimates for the domains with DIR estimates of zero or one as well as for those with zero sample size. For these domains, the huge uncertainty means that the SAE estimates greatly improve upon the DIR estimates by using cross-sectional and spatial information from other domains.
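
A minimal sketch of how the four discrepancy measures could be computed for one domain is given below. The exact formulas are not reproduced in this text, so the choices here are assumptions: RB and ARB are taken relative to the DIR estimate, RRSE is scaled by the SE of the SAE estimate (following the wording "relative to the SAE estimates"), and CVR is the CV of the SAE estimate divided by the CV of the DIR estimate. Domains where a measure is undefined (for example a DIR estimate of zero) are skipped, which is also an assumption of this example.

```python
def discrepancy_measures(sae, direct, se_sae, se_direct):
    """Return RB, ARB, RRSE and CVR (all in %) for one domain,
    or None when any of the ratios is undefined."""
    if sae <= 0 or direct <= 0 or se_sae <= 0 or se_direct <= 0:
        return None
    rb = 100.0 * (sae - direct) / direct
    return {
        "RB": rb,
        "ARB": abs(rb),
        # Assumed denominator: SE of the SAE estimate.
        "RRSE": 100.0 * (se_direct - se_sae) / se_sae,
        # CV = SE / estimate; values below 100 favour the SAE estimate.
        "CVR": 100.0 * (se_sae / sae) / (se_direct / direct),
    }


def mean_measures(rows):
    """Average each measure over the domains where it is defined."""
    results = [r for r in (discrepancy_measures(*row) for row in rows)
               if r is not None]
    if not results:
        return None
    return {k: sum(r[k] for r in results) / len(results) for k in results[0]}


# Hypothetical domains: (SAE, DIR, SE of SAE, SE of DIR).
rows = [(0.30, 0.25, 0.03, 0.06),
        (0.28, 0.00, 0.04, 0.00)]   # zero DIR estimate: skipped
print(mean_measures(rows))
```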
At the detailed level, the multilevel model-based SAE estimator is expected to outperform the DIR estimator in terms of the RRSE and CVR shown in Table S.4. To show how the models provide reliable estimates by borrowing strength from similar domains and spatial locations, the SAE estimates of stunting (with 95% CI) are examined for the detailed level domains of eight districts (Barguna, Bandarban, Dhaka, Sherpur, Khulna, Sirajganj, Dinajpur, and Sunamganj), chosen to represent the eight divisions as well as some data characteristics, such as domains with zero (and one) estimates and the degree of undernutrition vulnerability. Model-based estimates with their 95% CI (coloured dots and error-bar lines) are plotted along with the DIR estimates (black circular dots) in Figure ??. All the considered districts except Dhaka have domains with zero estimates, for which the SAE estimators provide reasonable estimates. Some districts (Sherpur, Satkhira, Sirajganj, and Sunamganj) have one or more urban domains with DIR estimates of one (the sample size ranges over 1-2 in most cases). The developed multilevel model improves on these inconsistent estimates and provides reasonable estimates by borrowing strength from the relevant domains, particularly in the highly vulnerable age-groups 12-23 and 24-35 months. Where the DIR estimates are comparatively more consistent, as for most of the urban domains of Dhaka district, the SAE estimates follow their trend. The performance of the developed multilevel models for wasting and underweight is very similar, and so the similar comparisons for all the indicators are shown by division in Supplementary Figures

Table S.4 Mean of relative bias (MRB, in %), mean of absolute relative bias (MARB, in %), mean of relative reduction of the standard errors (MRRSE, in %), and mean of CV ratios (CVR, in %) at different aggregation levels for the multilevel models of stunting, wasting and underweight.

S.5 District level prevalence of child undernutrition
The performance of the model-based estimators (SAE) at the detailed level has been examined by plotting the prevalence levels, standard errors (SE) and coefficients of variation (CV%) against those estimated by the design-based direct estimators (DIR) in Figure S.2 for all three indicators: stunting, wasting and underweight. Summary statistics of the model-based estimates of stunting, wasting and underweight at the district (D), district-sex (DS), district-age (DA), district-residence (DR), district-residence-sex (DRS), district-residence-age (DRA), and district-residence-age-sex (DRAS) levels, along with their CVs, are shown in Table S.5. District level prevalence of stunting, wasting and underweight estimated by the design-based (DIR) and model-based (SAE) estimators, along with the corresponding SEs and CVs expressed in percentage, are shown in Table S

S.9 Detailed level prevalence of Stunting
The detailed district-residence-age-sex level prevalences of stunting estimated by the SAE and DIR estimators are plotted in Figures